L04 Maps

Data Visualization (STAT 302)

Author

Sherry Chen

Overview

The goal of this lab is to explore various ways of building maps with ggplot2.

Challenges are not mandatory for students to complete. We highly recommend students attempt them though. We would expect graduate students to attempt the challenges.

Datasets

We’ll be using the US_income.rda dataset which should be placed in the /data subdirectory in our data_vis_labs project. You’ll also be downloading your own data to build maps.

Code
# Load package(s)
library(ggplot2)
library(tidyverse)
library(readr)
library(sf)
library(maps)
library(viridis)
library(viridisLite)
library(statebins)
# Load dataset(s)
load("US_income.rda")

Exercise 1

Plot 1

Make a county map of a US state using geom_polygon(). Maybe use your home state or a favorite state. Please do NOT use the state in the ggplot2 book example.

Optional: Consider adding major cities (or your home town).

Hints:

  • See section 6.1 in our book.
  • Void theme
Solution
Code
#il county data:
il_counties <- map_data("county", "illinois") |> 
  select(lon = long, lat, group, id = subregion)
head(il_counties)
        lon      lat group    id
1 -91.49563 40.21018     1 adams
2 -90.91121 40.19299     1 adams
3 -90.91121 40.19299     1 adams
4 -90.91121 40.10704     1 adams
5 -90.91121 39.83775     1 adams
6 -90.91694 39.75754     1 adams
Code
#add hometown or other cities of interest
chicago_city_data <- tibble(
  city_name = "Chicago",
  lon = -87.627778, 
  lat = 41.881944
)

ggplot(il_counties, aes(lon, lat)) +
  geom_polygon(aes(group = group), fill = NA, colour = "grey50") + 
  geom_point(data = chicago_city_data, aes(x = lon, y = lat)) +  # Corrected line
  coord_quickmap() +
  theme_void() +
  ggtitle("Illinois")

Plot 2

Now use geom_sf() instead. You’ll need to download data for this. You can use either the tigris (github page) or geodata packages. Either tigriscounties() with cb = TRUE or geodata’s gadm() could be useful.

Solution
Code
#option using tigris
# il county data using tigris package

il_counties <- tigris::counties(state = "il", cb = TRUE, progress_bar = FALSE) |> 
  janitor::clean_names()


ggplot(il_counties) + 
  geom_sf(aes(geometry = geometry), fill = NA) +
  theme_void()+
  ggtitle("Illinois")

Code
#option 2 using geodata 
#il countie using geodata package

usa_counties <- geodata::gadm(country = "USA", level = 2, path = tempdir()) |> 
  st_as_sf() |> 
  janitor::clean_names()
il_countiesdat <- usa_counties |> 
  filter(name_1 == "Illinois")

  
ggplot(il_countiesdat) + 
  geom_sf(aes(geometry = geometry), fill = NA) +
  theme_void()+
  ggtitle("Illinois")

Exercise 2

Using the US_income dataset, recreate the following graphics as precisely as possible.

Code
# Setting income levels
US_income <- mutate(
  US_income,
  income_bins = cut(
    ifelse(is.na(median_income), 25000, median_income),
    breaks = c(0, 40000, 50000, 60000, 70000, 80000),
    labels = c("< $40k", "$40k to $50k", 
               "$50k to $60k", "$60k to $70k", "> $70k"),
    right = FALSE
  )
)

Plot 1

Hints:

  • geom_sf() — boundary color is "grey80" and linewidth is 0.2
  • viridis package (discrete = TRUE in scale_* function)
  • Void theme
Solution
Code
# Setting income levels
US_income <- mutate(
  US_income,
  income_bins = cut(
    ifelse(is.na(median_income), 25000, median_income),
    breaks = c(0, 40000, 50000, 60000, 70000, 80000),
    labels = c("< $40k", "$40k to $50k", 
               "$50k to $60k", "$60k to $70k", "> $70k"),
    right = FALSE
  )
)

# US median income heat map 
ggplot(US_income, aes(geometry = geometry)) +
  geom_sf(aes(fill = income_bins), color = "grey80", linewidth = 0.2) +
  theme_void() +
  labs(
    fill = "Median\nIncome"
  ) +
 scale_fill_viridis(discrete = TRUE)

Plot 2

Hints:

  • statebins::geom_statebins()
  • viridis package (discrete = TRUE in scale_* function)
  • Statebins theme
Solution
Code
# Setting income levels
US_income <- mutate(
  US_income,
  income_bins = cut(
    ifelse(is.na(median_income), 25000, median_income),
    breaks = c(0, 40000, 50000, 60000, 70000, 80000),
    labels = c("< $40k", "$40k to $50k", 
               "$50k to $60k", "$60k to $70k", "> $70k"),
    right = FALSE
  )
)

ggplot(US_income) +
  geom_statebins(aes(state = name, fill = income_bins)) +
  theme_statebins() +
  labs(
    fill = "Median\nIncome"
  ) +
 scale_fill_viridis(discrete = TRUE)

Exercise 3

Pick any city or foreign country to build a map for. You can dress it up or make it as basic as you want. Also welcome to try building a graphic like that depicted at the end of section 6.5 — use a different region though.

Solution
Code
# Corrected region name
nm_counties <- map_data("county", "new mexico") %>% 
  select(lon = long, lat, group, id = subregion)

head(nm_counties)
        lon      lat group         id
1 -106.2436 35.21972     1 bernalillo
2 -106.2436 35.02491     1 bernalillo
3 -106.2436 35.02491     1 bernalillo
4 -106.2436 34.95043     1 bernalillo
5 -106.1462 34.95043     1 bernalillo
6 -106.1462 34.85875     1 bernalillo
Code
#add hometown or other cities of interest
new_mexico_data <- tibble(
  city_name = "Chicago",
  lon = -105.964444, 
  lat = 35.667222
)

ggplot(nm_counties, aes(lon, lat)) +
  geom_polygon(aes(group = group), fill = NA, colour = "grey50") + 
  geom_point(data = new_mexico_data, aes(x = lon, y = lat)) +  # Corrected line
  coord_quickmap() +
  theme_void() +
  ggtitle("New Mexico")

Challenge(s)

Important

It is optional for everyone, but we highly encourage students to give it a try. Several students in past versions of this course use mapping data like this to for their project. Usually using the leaflet or mapview packages. This one is done with mapview, but as an additional challeneg students could try doing this with leaflet too.

Using the tidycensus package and few others, try to create a map like below using these directions. Try using a different geographical area and a different variable from the ACS.

Solution

YOUR SOLUTION HERE